Abstract

We consider the problem of finding a semi-matching in bipartite graphs, which is also extensively
studied under various names in the scheduling literature. We give faster algorithms for both the
weighted and unweighted cases.

For the weighted case, we give an $O(nm \log n)$-time algorithm, where $n$ is the number of
vertices and $m$ is the number of edges, by exploiting the geometric structure of the problem.
This improves the classical $O(n^3)$-time algorithms by Horn [Operations Research 1973] and
Bruno, Coffman and Sethi [Communications of the ACM 1974].

For the unweighted case, the bound can be improved even further. We give a simple
divide-and-conquer algorithm which runs in $O(\sqrt{n}\, m \log n)$ time, improving two previous
$O(nm)$-time algorithms by Abraham [MSc thesis, University of Glasgow 2003] and Harvey, Ladner,
Lovász and Tamir [WADS 2003 and Journal of Algorithms 2006]. We also extend this algorithm to solve
the Balanced Edge Cover problem in $O(\sqrt{n}\, m \log n)$ time, improving the previous
$O(nm)$-time algorithm by Harada, Ono, Sadakane and Yamashita [ISAAC 2008].

Categories and Subject Descriptors: G.2.2 [Graph Theory]— Graph algorithms 
General Terms: Algorithms, Theory 

Additional Key Words and Phrases: Semi-Matching, Scheduling, Combinatorial Optimization,
Design and Analysis of Algorithms

1 Introduction 

In this paper, we consider a relaxation of the maximum bipartite matching problem called the
semi-matching problem, in both the weighted and unweighted cases. This problem has been previously
studied in the scheduling literature under different names, mostly known as (non-preemptive)
scheduling of independent jobs on unrelated machines to minimize flow time, or $R \,||\, \sum C_j$ in the
standard scheduling notation [31 [26j 12] • 

Informally, the problem can be explained by the following off-line load balancing scenario: We 
are given a set of jobs and a set of machines. Each machine can process one job at a time and it 
takes different amounts of time to process different jobs. Each job also requires different processing 
times if processed by different machines. One natural goal is to have all jobs processed with the 



*A preliminary version of this paper appeared in ICALP'10 [11].

^Most of the work was done while all authors were at Kasetsart University. 
minimum total completion time, or total flow time, which is the sum of the durations each
job has to wait until it is finished. Observe that if the assignment is known, the order in which
each machine processes its assigned jobs is clear: it processes jobs in increasing order of processing time.

To be precise, the semi-matching problem is as follows. Let $G = (U \cup V, E)$ be a weighted
bipartite graph, where $U$ is a set of jobs and $V$ is a set of machines. For any edge $uv$, let $w_{uv}$ be its
weight; it indicates the time it takes $v$ to process $u$. Throughout this paper,
let $n$ denote the number of vertices and $m$ denote the number of edges in $G$. A set $M \subseteq E$ is a
semi-matching if each job $u \in U$ is incident with exactly one edge in $M$. For any semi-matching
$M$, we define the cost of $M$, denoted by $\mathrm{cost}(M)$, as follows. First, for any machine $v \in V$, its cost
with respect to a semi-matching $M$ is

$$\mathrm{cost}_M(v) = (w_1) + (w_1 + w_2) + \cdots + (w_1 + \cdots + w_{\deg_M(v)}) = \sum_{i=1}^{\deg_M(v)} (\deg_M(v) - i + 1) \cdot w_i$$

where $\deg_M(v)$ is the degree of $v$ in $M$ and $w_1 \le w_2 \le \ldots \le w_{\deg_M(v)}$ are the weights of the edges in $M$
incident with $v$, sorted increasingly. Intuitively, this is the total completion time of the jobs assigned to
$v$. Note that for the unweighted case (i.e., when $w_e = 1$ for every edge $e$), the cost of a machine $v$ is
simply $\deg_M(v) \cdot (\deg_M(v) + 1)/2$. Now, the cost of the semi-matching $M$ is simply the sum
of the costs over all machines:

$$\mathrm{cost}(M) = \sum_{v \in V} \mathrm{cost}_M(v).$$

The goal is to find an optimal semi-matching, i.e., a semi-matching with minimum cost.
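The cost definition above can be illustrated with a minimal sketch. The function name `semi_matching_cost` and the dictionary-based input format are assumptions made for illustration only; they are not part of the paper.

```python
from collections import defaultdict

def semi_matching_cost(assignment, weight):
    """Cost of a semi-matching.

    assignment: dict mapping each job u to its machine v (hypothetical format).
    weight: dict mapping (u, v) to the processing time w_uv.
    """
    times_on = defaultdict(list)
    for job, machine in assignment.items():
        times_on[machine].append(weight[(job, machine)])
    total = 0
    for times in times_on.values():
        times.sort()                      # shortest processing time first
        k = len(times)                    # k = deg_M(v)
        # The job at 0-based position i delays itself and the k - i - 1 jobs
        # after it, so its weight is counted k - i times.
        total += sum((k - i) * t for i, t in enumerate(times))
    return total
```

For instance, a machine that receives jobs with processing times 1, 2 and 3 finishes them at times 1, 3 and 6, for a cost of 10, which matches $3 \cdot 1 + 2 \cdot 2 + 1 \cdot 3$.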

Previous works: Although the name "semi-matching" was recently proposed by Harvey, Ladner,
Lovász, and Tamir [19], the problem was studied as early as the 1970s, when an $O(n^3)$ algorithm was
independently developed by Horn in [20] and by Bruno, Coffman and Sethi in [6]. Since then,
no progress has been made on this problem except on its special cases and variations. For the
special case of inclusive set restriction where, for each pair of jobs $u_1$ and $u_2$, either all neighbors
of $u_1$ are neighbors of $u_2$ or vice versa, a faster algorithm with $O(n^2)$ running time was given by
Spyropoulos and Evans [40]. Many variations of this problem were proved to be NP-hard, including
the preemptive version [39], the case when there are deadlines [41], and the case of optimizing total
weighted tardiness [29]. The variation where the objective is to minimize $\max_{v \in V} \mathrm{cost}_M(v)$ was
also considered [32j EH] . 

The unweighted case of the semi-matching problem has also received considerable attention in the
past few years. Since it was shown in [19] that an optimal solution of the semi-matching problem
is also optimal for the makespan version of the scheduling problem (where one wants to minimize
the time the last machine finishes), we mention the results for both problems. The problem was
first studied in a special case, called the nested case, where, for any two jobs, if their sets of neighbors
are not disjoint, then one of these sets contains the other. This case is shown to be solvable in
$O(m + n \log n)$ time [36, p. 103]. For the general unweighted semi-matching problem, Abraham [1,
Section 4.3] and Harvey, Ladner, Lovász and Tamir [19] independently developed two algorithms
with $O(nm)$ running time. Lin and Li [28] also gave an $O(n^3 \log n)$-time algorithm, which was later
generalized to a more general cost function [27]. Recently, Lee, Leung and Pinedo [24] showed that
the problem can be solved in polynomial time even when there are release times.

The unweighted semi-matching problem was recently generalized to the quasi-matching problem
by Bokal, Bresar and Jerebic [3]. In this problem, a function $g$ is provided and each vertex $u \in U$
is required to connect to at least $g(u)$ vertices in $V$. Therefore, the semi-matching problem is the
case where $g(u) = 1$ for every $u \in U$. They also developed an algorithm for this problem which is a
generalization of the Hungarian method and used it to deal with a routing problem in CDMA-based
wireless sensor networks.

Motivated by the problem of assigning wireless stations (users) to access points, the unweighted
semi-matching problem is also generalized to the problem of finding an optimal semi-matching with
minimum weight, for which an $O(n^2 m)$-time algorithm is given [15].

Approximation algorithms and online algorithms for this problem (both weighted and
unweighted cases) and the makespan version have also gained a lot of attention over the past few
decades and have applications ranging from scheduling in hospitals to wireless communication
networks. (See [26, 48] for recent surveys.)

Applications: As motivated by Harvey et al. [19], even in an online setting where jobs arrive and
depart over time, they may be reassigned from one machine to another cheaply if the algorithm's
runtime is significantly faster than the arrival/departure rate. (One example of such a case is the
Microsoft Active Directory system [14, 19].) The problem also arose from Video on Demand
(VoD) systems, where the load of video disks needs to be balanced while data blocks from the disks
are retrieved or while serving clients [31, 45]. The problem, if solved in the distributed setting, can
be used to construct a load-balanced data gathering tree in sensor networks [37, 33]. The same
problem also arose in peer-to-peer systems [42, 23, 43].

In this paper, we also consider an "edge cover" version of the problem. In some applications
such as sensor networks, there are no jobs and machines, but the sensor nodes have to be clustered
and each cluster has to pick its own head node to gather information from other nodes in the
cluster. Motivated by this, Harada, Ono, Sadakane and Yamashita [16] introduced the balanced
edge cover problem, where the goal is to find an edge cover (a set of edges incident to every vertex)
that minimizes the total cost over all vertices. (The cost on each vertex is as previously defined.)
They gave an $O(nm)$ algorithm for this problem and claimed that it could be used to solve the
semi-matching problem as well. We show that this problem can be efficiently reduced to the
semi-matching problem. Thus, our algorithm (for the unweighted case) also gives a better bound on the
balanced edge cover problem.

Our results and techniques 

We consider the semi-matching problem and give a faster algorithm for each of the weighted and 
unweighted cases. We also extend the algorithm for the unweighted case to solve the balanced edge 
cover problem. 

• Weighted Semi-Matching: (Section 2) We present an $O(nm \log n)$ algorithm, improving
the previous $O(n^3)$ algorithm by Horn [20] and Bruno et al. [6]. As in the previous results [20,
17], we use the reduction of the weighted semi-matching problem to the weighted bipartite
matching problem as a starting point. We, however, only use the structural properties arising
from the reduction and do not actually perform the reduction.

• Unweighted Semi-Matching: (Section 3) We give an $O(\sqrt{n}\, m \log n)$ algorithm, improving

x This problem is also known as a constant jump system (see, e.g., [441 131)] ). 
the previous $O(nm)$ algorithms by Abraham [1] and Harvey et al. [19]. Our algorithm uses
the same reduction to the min-cost flow problem as in [19]. However, instead of cancelling
one negative cycle in each iteration, our algorithm exploits the structure of the graphs and
the cost functions to cancel many negative cycles in a single iteration. This technique can
also be generalized to any convex cost function.

• Balanced Edge Cover: (Section 0]) We also present a reduction from the balanced edge 
cover problem to the unweighted semi-matching problem. This leads to an $O(\sqrt{n}\, m \log n)$
algorithm for the problem, improving the previous $O(nm)$ algorithm by Harada et al. [16].
The main idea is to identify the "center" vertices of all the clusters in the optimal solution. 
(Note that any balanced edge cover (in fact, any minimal edge cover) clusters the vertices 
into stars.) Then, we partition the vertices into two sides, center and non-center ones, and 
apply the semi-matching algorithm on this graph. 

2 Weighted semi-matching 

In this section, we present an algorithm that finds an optimal weighted semi-matching in
$O(nm \log n)$ time.



Overview 

Our improvement follows from studying the reduction from the weighted semi-matching problem
to the weighted bipartite matching problem considered in the previous works [20, 6, 17] and the
Edmonds-Karp-Tomizawa (EKT) algorithm for finding the weighted bipartite matching [9H37]. We 
first review these briefly. For more detail, see Appendices A and B.



Reduction: As in [20, 6, 17], we consider the reduction from the semi-matching problem on a
bipartite graph $G = (U \cup V, E)$ to the minimum-weight bipartite matching problem on a graph $\hat{G}$. The
reduction is done by exploding the vertices in $V$, i.e., for each vertex $v \in V$ we create $\deg(v)$
vertices, $v^1, v^2, \ldots, v^{\deg(v)}$. We also make copies of the edges incident to $v$ in the original graph $G$, i.e.,
for each vertex $u \in U$ such that $uv \in E$, we create edges $uv^1, uv^2, \ldots, uv^{\deg(v)}$. For each edge $uv^i$
incident to $v^i$ in $\hat{G}$, we set its weight to $i$ times its original weight in $G$, i.e., $w_{uv^i} = i \cdot w_{uv}$. We
denote the set of these vertices by $V_v$. Thus, we have

$$\hat{G} = (U \cup \hat{V}, \hat{E})$$
$$\hat{V} = \{v^1, v^2, \ldots, v^{\deg_G(v)} : v \in V\}$$
$$\hat{E} = \{uv^1, uv^2, \ldots, uv^{\deg_G(v)} : uv \in E\}$$
$$w_{uv^i} = i \cdot w_{uv} \quad \forall uv \in E,\ i \in \{1, 2, \ldots, \deg_G(v)\}$$

The correctness of this reduction can be seen by replacing the edges incident to $v$ in the
semi-matching by the edges incident to $v^1, v^2, \ldots$ with weights in decreasing order. For example, in
Figure 1(a), the edges of the semi-matching in $G$ correspond to edges of the matching in $\hat{G}$ that are
incident to exploded copies of their machine endpoints. The reduction is illustrated in Figure 1(a).
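As an illustration of the reduction, the following minimal sketch builds the edge weights of the exploded graph. The function name `explode_machines` and the representation of the copy $v^i$ as the pair `(v, i)` are assumptions for illustration, not the paper's notation.

```python
def explode_machines(edges):
    """Build the weighted edge set of the exploded graph.

    edges: dict mapping (u, v) to w_uv in the original graph G.
    Returns a dict mapping (u, (v, i)) to i * w_uv for i = 1..deg(v),
    where the pair (v, i) stands for the exploded copy v^i.
    """
    deg = {}
    for (_, v) in edges:
        deg[v] = deg.get(v, 0) + 1
    exploded = {}
    for (u, v), w in edges.items():
        for i in range(1, deg[v] + 1):
            exploded[(u, (v, i))] = i * w   # weight of u v^i is i times w_uv
    return exploded
```

With two jobs of weights 3 and 5 on one machine, the semi-matching cost is $2 \cdot 3 + 1 \cdot 5 = 11$. Assigning the heavier job to the first copy and the lighter job to the second copy gives matching weight $1 \cdot 5 + 2 \cdot 3 = 11$, which is the decreasing-order correspondence described above.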



2 We also observe an $O(n^{5/2} \log n)$ algorithm that arises directly from the reduction by applying [21].
Figure 1: (a) Reduction. (b) Residual graphs.



This alone does not give an improvement on the semi-matching problem because the number
of edges becomes $O(nm)$. However, we can apply some tricks to improve the running time. (See
Appendix B.)



EKT algorithm: Our improvement comes from studying the behavior of the EKT algorithm for
finding the bipartite matching in $\hat{G}$. The EKT algorithm iteratively increases the cardinality of
the matching by one by finding a shortest augmenting path. Such a path can be found by applying
Dijkstra's algorithm on the residual graph $D_M$ (corresponding to a matching $M$) with the reduced
cost, denoted by $\tilde{w}$, as the edge length.



Figure 1(b) shows examples of a residual graph $D_M$. The direction of an edge depends on
whether it is in the matching or not. The weight of each edge depends on its weight in the original
graph and the costs on its end vertices. We draw an edge of length 0 from $s$ to all vertices in $U_M$
and from all vertices in $V_M$ to $t$, where $U_M$ and $V_M$ are the sets of unmatched vertices in $U$ and
$V$, respectively. We want to find the shortest path from $s$ to $t$ or, equivalently, from $U_M$ to $V_M$.

The reduced cost is computed from the potentials on the vertices, which can be found as in
Algorithm 2.1.

Applying the EKT algorithm directly leads to an $O(n(n' \log n' + m'))$ running time, where $n = |U|$,
$n' = |U \cup \hat{V}|$ and $m'$ is the number of edges in $\hat{G}$. Since $n' = O(m)$ (because $|\hat{V}| = O(m)$) and
$m' = \Theta(n^2)$, the running time is $O(nm \log n + n^3)$. (We note that this could be brought down to
$O(n^3)$ by applying the result of Kao, Lam, Sung and Ting [21] to reduce the number of participating
edges. See Appendix B.) The bottleneck here is Dijkstra's algorithm, which needs $O(n' \log n' + m')$
time. We now review this algorithm and pinpoint the part that will be sped up.
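The arithmetic behind this bound can be spelled out as follows. This is only a restatement using the quantities given in the text, together with the standard fact that $\log m = O(\log n)$ because $m \le n^2$.

```latex
\begin{align*}
n\,(n' \log n' + m')
  &= O\bigl(n\,(m \log m + n^2)\bigr) && \text{since } n' = O(m) \text{ and } m' = \Theta(n^2) \\
  &= O(nm \log n + n^3)               && \text{since } \log m = O(\log n).
\end{align*}
```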



Dijkstra's algorithm: Recall that Dijkstra's algorithm starts from a source vertex and keeps
adding to its shortest path tree a vertex with minimum tentative distance. When a new vertex $v$ is
added, the algorithm updates the tentative distances of all vertices outside the tree by relaxing all
edges incident to $v$. On an $n'$-vertex $m'$-edge graph, it takes $O(\log n')$ time (using a priority queue)
to find a new vertex to add to the tree and hence $O(n' \log n')$ in total. Further, relaxing all edges
3 Note that we set the potentials in an unusual way: We keep the potentials of the unmatched vertices in $V$ at 0. The
reason is roughly that we can speed up the process of finding the distances of all vertices but the vertices in $V_M$. Notice
that this type of potentials is valid too (i.e., $\tilde{w}$ is non-negative) since for any edge $uv$ such that $v \in V_M$ is unmatched,
$\tilde{w}_{uv} = w_{uv} + p(u) - p(v) = w_{uv} + p(u) \ge 0$.
Algorithm 2.1 EKT Algorithm (G, w)

1: Let $M = \emptyset$.
2: For every node $v$, let $p(v) = 0$. ($p(v)$ is a potential on $v$.)
3: repeat
4:   Let $\tilde{w}_{uv} = w_{uv} + p(u) - p(v)$ for every edge $uv$. ($\tilde{w}_{uv}$ is the reduced cost of an edge $uv$.)
5:   For every node $v$, compute the distance $d(v)$, which is the distance from $U_M$ (the set of
     unmatched vertices in $U$) to $v$ in $D_M$. (Recall that the length of edges in $D_M$ is $\tilde{w}$.)
6:   Let $P$ be the shortest $U_M$-$V_M$ path in $D_M$.
7:   Update the potential $p(u)$ to $d(u)$ for every vertex $u \in U \cup (V \setminus V_M)$.
8:   Augment $M$ along $P$, i.e., $M = P \triangle M$ (where $\triangle$ denotes the symmetric difference operator).
9: until all vertices in $U$ are matched
10: return $M$



takes $O(m')$ time in total. Recall that in our case, $m' = \Theta(n^2)$, which is too large. Thus, we wish
to reduce the number of edge relaxations to improve the overall running time.
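For readers who want to experiment with the augmenting-path scheme reviewed above, here is a minimal sketch of successive shortest augmenting paths with Dijkstra on reduced costs. It is a textbook variant under stated assumptions, not the paper's implementation: it adds the computed distances to the potentials of all reached vertices (rather than keeping unmatched right vertices at potential 0), it relaxes every edge individually, and the function name `min_cost_matching_ssp` and the adjacency-list format are hypothetical.

```python
import heapq

def min_cost_matching_ssp(n_left, n_right, adj):
    """Match every left vertex at minimum total weight.

    adj[u] is a list of (v, w) pairs with w >= 0. Returns (cost, match_u).
    """
    INF = float("inf")
    pot_u, pot_v = [0] * n_left, [0] * n_right       # potentials
    match_u, match_v = [-1] * n_left, [-1] * n_right
    for _ in range(n_left):
        # Dijkstra from all unmatched left vertices on reduced costs.
        dist_u, dist_v = [INF] * n_left, [INF] * n_right
        prev_v = [-1] * n_right                       # left predecessor of v
        heap = []
        for u in range(n_left):
            if match_u[u] == -1:
                dist_u[u] = 0
                heap.append((0, u))
        heapq.heapify(heap)
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist_u[u]:
                continue                              # stale heap entry
            for v, w in adj[u]:
                if match_u[u] == v:
                    continue                          # matched edge points v -> u
                nd = d + w + pot_u[u] - pot_v[v]      # reduced cost is >= 0
                if nd < dist_v[v]:
                    dist_v[v], prev_v[v] = nd, u
                    mate = match_v[v]
                    # A matched edge has reduced cost 0, so the mate of v
                    # inherits the distance of v.
                    if mate != -1 and nd < dist_u[mate]:
                        dist_u[mate] = nd
                        heapq.heappush(heap, (nd, mate))
        # Closest unmatched right vertex in terms of true (unreduced) distance.
        free = [v for v in range(n_right)
                if match_v[v] == -1 and dist_v[v] < INF]
        if not free:
            raise ValueError("some left vertex cannot be matched")
        target = min(free, key=lambda v: dist_v[v] + pot_v[v])
        # Textbook potential update: add the distances just computed.
        for u in range(n_left):
            if dist_u[u] < INF:
                pot_u[u] += dist_u[u]
        for v in range(n_right):
            if dist_v[v] < INF:
                pot_v[v] += dist_v[v]
        # Augment along the shortest path by flipping its edges.
        v = target
        while v != -1:
            u = prev_v[v]
            v_next = match_u[u]
            match_u[u], match_v[v] = v, u
            v = v_next
    weight = {(u, v): w for u in range(n_left) for v, w in adj[u]}
    return sum(weight[(u, match_u[u])] for u in range(n_left)), match_u
```

Running this on the exploded graph would relax each exploded edge separately, which is exactly the per-iteration cost the rest of this section sets out to avoid.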



Our approach: We reduce the number of edge relaxations as follows. Suppose that a vertex
$u \in U$ is added to the shortest path tree. For every $v \in V$ that is a neighbor of $u$ in $G$, we relax all edges
$uv^1, uv^2, \ldots$ in $\hat{G}$ at the same time. In other words, instead of relaxing $O(nm)$ edges in $\hat{G}$
separately, we group the edges into $m$ groups (according to the edges in $G$) and relax all edges in each
group together. We develop a relaxation method that takes $O(\log n)$ time per group. In particular,
we design a data structure $H_v$, for each vertex $v \in V$, that supports the following operations.

• Relax($uv$, $H_v$): This operation works as if it relaxes edges $uv^1, uv^2, \ldots$

• AccessMin($H_v$): This operation returns a vertex $v^i$ (exploded from $v$) with minimum
tentative distance among vertices that are not deleted (by the next operation).

• DeleteMin($H_v$): This operation finds $v^i$ from AccessMin and then returns and deletes $v^i$.

Our main result is that, by exploiting the structure of the problem, one can design $H_v$ that
supports Relax, AccessMin and DeleteMin in $O(\log n)$, $O(1)$ and $O(\log n)$ time, respectively. Before
showing this result, we note that speeding up Dijkstra's algorithm, and hence the EKT algorithm, is
quite straightforward once we have $H_v$: We simply build a binary heap $H$ whose nodes correspond
to the vertices of the original graph $G$. For each vertex $u \in U$, $H$ keeps track of its tentative distance.
For each vertex $v \in V$, $H$ keeps track of its minimum tentative distance returned from $H_v$.



Main idea: Before going into details, we sketch the main idea here. The data structure $H_v$ that
allows a fast "group relaxation" operation can be built because of the following nice structure of the
reduction: For each edge $uv$ of weight $w_{uv}$ in $G$, the weights $w_{uv^1}, w_{uv^2}, \ldots$ of the corresponding
edges in $\hat{G}$ increase linearly (i.e., $w_{uv}, 2w_{uv}, 3w_{uv}, \ldots$). This enables us to know the order of vertices,
among $v^1, v^2, \ldots$, that will be added to the shortest path tree. For example, in Figure 1(b), when
$M = \emptyset$, we know that, among $v^1$ and $v^2$, $v^1$ will be added to the shortest path tree first as it always
has a smaller tentative distance.
However, since the length of edges in $D_M$ does not solely depend on the weights of the edges
in $\hat{G}$ (in particular, it also depends on the potentials on both end vertices), it is possible (after some
iterations of the EKT algorithm) that $v^1$ is added to the shortest path tree after $v^2$.

Fortunately, due to the way the potential is defined by the EKT algorithm, a similar nice
property still holds: Among $v^1, v^2, \ldots$ in $D_M$ corresponding to $v$ in $G$, if a vertex $v^k$, for some
$k$, is added to the shortest path tree first, then the vertices on each side of $v^k$ have a nice order:
Among $v^1, v^2, \ldots, v^{k-1}$, the order of vertices added to the shortest path tree is $v^{k-1}, v^{k-2}, \ldots, v^2, v^1$.
Further, among $v^{k+1}, v^{k+2}, \ldots$, the order of vertices added to the shortest path tree is $v^{k+1}, v^{k+2}, \ldots$.

This main property, along with a few other observations, allows us to construct the data structure
$H_v$. In the next section, we show the properties we need and use them to construct $H_v$ in the
following section.



2.1 Properties of the tentative distance 

Consider any iteration of the EKT algorithm (with a potential function $p$ and a matching $M$). We
study the following functions $f_{*v}$ and $g_{*v}$.

Definition 2.1. For any edge $uv$ from $U$ to $V$ and any integer $1 \le i \le \deg(v)$, let

$$g_{uv}(i) = d(u) + p(u) + i \cdot w_{uv} \quad\text{and}\quad f_{uv}(i) = g_{uv}(i) - p(v^i) = d(u) + p(u) - p(v^i) + i \cdot w_{uv}.$$

For any $v \in V$ and $i \in [\deg(v)]$, define the lower envelopes of $f_{uv}$ and $g_{uv}$ over all $u \in U$ as

$$f_{*v}(i) = \min_{u : uv \in E} f_{uv}(i) \quad\text{and}\quad g_{*v}(i) = \min_{u : uv \in E} g_{uv}(i).$$

Our goal is to understand the structure of the function $f_{*v}$, whose values $f_{*v}(1), f_{*v}(2), \ldots$ are the
tentative distances of $v^1, v^2, \ldots$, respectively. The function $g_{*v}$ is simply $f_{*v}$ with the potential of
$v$ ignored. We define $g_{*v}$ as it is easier to keep track of, since it is a combination of linear functions
$g_{uv}$ and therefore piecewise linear. Now we state the key properties that enable us to keep track of
$f_{*v}$ efficiently. Recall that $v^1, v^2, \ldots$ are the exploded vertices of $v$ (from the reduction).
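To make Definition 2.1 concrete, the sketch below evaluates both lower envelopes by brute force. The function name `lower_envelopes` and the input encoding (`base[u]` for $d(u) + p(u)$, `w[u]` for $w_{uv}$, `pot[i-1]` for $p(v^i)$) are assumptions chosen for illustration.

```python
def lower_envelopes(base, w, pot):
    """Evaluate g_{*v}(i), f_{*v}(i) and a minimizing u for i = 1..len(pot).

    base[u] stands for d(u) + p(u), w[u] for w_uv, pot[i-1] for p(v^i).
    Ties between neighbors are broken by the smaller key u.
    """
    g_star, f_star, arg_u = [], [], []
    for i in range(1, len(pot) + 1):
        u_best = min(base, key=lambda u: (base[u] + i * w[u], u))
        g = base[u_best] + i * w[u_best]      # g_{uv}(i) for the best u
        g_star.append(g)
        f_star.append(g - pot[i - 1])         # f differs from g only by p(v^i)
        arg_u.append(u_best)
    return g_star, f_star, arg_u
```

Because $f_{uv}(i)$ and $g_{uv}(i)$ differ by the term $p(v^i)$, which does not depend on $u$, the same neighbor minimizes both envelopes at every $i$. The sketch relies on this when it derives `f_star` from `g_star`.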

Proposition 2.2. Consider a matching $M$ and a potential $p$ at any iteration of the EKT algorithm.

(1) For any vertex $v \in V$, there exists $\alpha_v$ such that $v^1, \ldots, v^{\alpha_v}$ are all matched and
$v^{\alpha_v+1}, \ldots, v^{\deg(v)}$ are all unmatched.

(2) For any vertex $v \in V$, $g_{*v}$ is a piecewise linear function.

(3) For any edge $uv \in E$ where $u \in U$ and $v \in V$, and any $i$, $f_{uv}(i) = f_{*v}(i)$ if and only if
$g_{uv}(i) = g_{*v}(i)$.

(4) For any edge $uv \in E$ where $u \in U$ and $v \in V$, let $\alpha_v$ be as in (1). There exists an integer
$1 \le \gamma_{uv} \le \alpha_v$ such that for $i = 1, 2, \ldots, \gamma_{uv} - 1$, $f_{uv}(i) > f_{uv}(i+1)$ and for
$i = \gamma_{uv}, \gamma_{uv}+1, \ldots, \alpha_v - 1$, $f_{uv}(i) \le f_{uv}(i+1)$. In other words,
$f_{uv}(1), f_{uv}(2), \ldots, f_{uv}(\alpha_v)$ is a unimodal sequence.



Figures 2(a) and 2(b) show the structure of $g_{*v}$ and $f_{*v}$ according to statements (2) and (4) in
the above proposition. By statement (3), the two pictures can be combined as in Figure 2(c): $g_{*v}$
indicates the $u$ that makes both $g_{*v}$ and $f_{*v}$ minimum in each interval, and one can find the $i$ that minimizes
$f_{*v}$ in each interval by looking at $\alpha_v$ (or near $\alpha_v$ in some cases).
Figure 2: (a) $g_{*v}$ and the potential function; note that $w_{u_1v} > w_{u_2v} > w_{u_3v}$. (b) $f_{uv}$ is unimodal. (c) $f_{*v}$ together with $g_{*v}$.

Proof. 

(1) The first statement follows from the following claim.

Claim 2.3. For any $i$, if the exploded vertex $v^{i+1}$ of $v$ (in $V_v$) is matched by $M$, then $v^i$ is also
matched.

Proof. The claim follows from the fact that the EKT algorithm maintains $M$ so that $M$ is an extreme
matching. Suppose that $v^{i+1}$ is matched by $M$ (i.e., $uv^{i+1} \in M$), but $v^i$ is not matched. Then we
can remove $uv^{i+1}$ from $M$ and add $uv^i$ to $M$. The resulting matching has a cost less than that of $M$
but the same cardinality, a contradiction. □

(2) To see the second statement, notice that $g_{uv}(i) = d(u) + p(u) + i \cdot w_{uv}$ is linear in $i$ for a fixed $uv \in E$.
Hence, $g_{*v}$ is a lower envelope of linear functions, implying that it is piecewise linear.

(3) To prove the third statement, recall that for any $u$ and any $i$, $f_{uv}(i) = g_{uv}(i) - p(v^i)$. Therefore,
for any $u$, $u'$ and $i$, $f_{uv}(i) > f_{u'v}(i)$ if and only if $g_{uv}(i) > g_{u'v}(i)$. Thus, the third statement follows.

(4) For the fourth statement, we first explain the intuition. First, observe that the function $g_{uv}$ is
increasing with rate $w_{uv}$. Moreover, the difference of $f_{uv}(i)$ and $f_{uv}(j)$ is a function of the potentials
$p(v^i)$ and $p(v^j)$ and the multiple of the edge weight $(j - i) w_{uv}$. In fact, whether the difference is negative
or positive depends on the values of these three parameters. We show that these parameters change
monotonically and so we have the desired property.

To prove the fourth statement formally, we first prove two claims.

For the first claim below, recall that the potential of matched vertices, at any iteration, is
defined to be the distance on the residual graph of the previous iteration. In particular, for any
$v^i \in \hat{V}$, there is a vertex $u \in U$ such that $p(u) + i \cdot w_{uv} = p(v^i)$. (See Algorithm 2.1.)

Claim 2.4. For any integer $i < \alpha_v$, consider the exploded vertices $v^i$ and $v^{i+1}$. Let $u$ and $u'$
denote two vertices in $U$ such that $p(u) + i \cdot w_{uv} = p(v^i)$ and $p(u') + (i+1) \cdot w_{u'v} = p(v^{i+1})$. Then
$w_{uv} \ge p(v^{i+1}) - p(v^i) \ge w_{u'v}$.

Proof. The first part, $w_{uv} \ge p(v^{i+1}) - p(v^i)$, follows from $p(v^i) = p(u) + i \cdot w_{uv}$ and
$p(v^{i+1}) \le p(u) + (i+1) \cdot w_{uv}$. The second part, $p(v^{i+1}) - p(v^i) \ge w_{u'v}$, follows from
$p(v^i) \le p(u') + i \cdot w_{u'v}$ and $p(v^{i+1}) = p(u') + (i+1) \cdot w_{u'v}$. □

The proof of the next claim follows directly from the definition of $f_{uv}$ (cf. Definition 2.1).

Claim 2.5. For any $i < \alpha_v$, $f_{uv}(i) > f_{uv}(i+1)$ if and only if $p(v^{i+1}) - p(v^i) > w_{uv}$, and
$f_{uv}(i) < f_{uv}(i+1)$ if and only if $p(v^{i+1}) - p(v^i) < w_{uv}$.
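The equivalence in Claim 2.5 can be checked in one line by expanding Definition 2.1. The terms $d(u) + p(u)$ cancel, leaving only the potential difference and the edge weight:

```latex
\begin{align*}
f_{uv}(i) - f_{uv}(i+1)
  &= \bigl(d(u) + p(u) - p(v^{i}) + i \cdot w_{uv}\bigr)
   - \bigl(d(u) + p(u) - p(v^{i+1}) + (i+1) \cdot w_{uv}\bigr) \\
  &= \bigl(p(v^{i+1}) - p(v^{i})\bigr) - w_{uv}.
\end{align*}
```

Hence the sign of $f_{uv}(i) - f_{uv}(i+1)$ is the sign of $p(v^{i+1}) - p(v^i) - w_{uv}$.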

Now, the fourth statement in the Proposition follows from the following statements: For any
integer $i < \alpha_v$,

(i) if $f_{uv}(i) > f_{uv}(i+1)$, then $f_{uv}(j) > f_{uv}(j+1)$ for any integer $j < i$, and

(ii) if $f_{uv}(i) < f_{uv}(i+1)$, then $f_{uv}(j) < f_{uv}(j+1)$ for any integer $i < j < \alpha_v$.

To prove the first statement, let $u'$ be such that $p(u') + i \cdot w_{u'v} = p(v^i)$. If $f_{uv}(i) > f_{uv}(i+1)$,
then

$$p(v^i) - p(v^{i-1}) \ge w_{u'v} \ge p(v^{i+1}) - p(v^i) > w_{uv}$$

where the first two inequalities follow from Claim 2.4 and the third inequality follows from Claim 2.5.
It then follows from Claim 2.5 that $f_{uv}(i-1) > f_{uv}(i)$. The first statement follows by repeating
the argument above. The second statement can be proved similarly. This completes the proof of
the fourth statement. □



2.2 Data structure 

Specification: Let us first redefine the problem so that we can talk about the data structure in
a more general way. We show how to use this data structure for the semi-matching problem in the
next section.

Let $n$ and $N$ be positive integers and, for any integer $i$, define $[i] = \{1, 2, \ldots, i\}$. We would like
to maintain at most $n$ functions $f_1, f_2, \ldots, f_n$ mapping $[N]$ to a set of positive reals. We assume
that $f_i$ is given as an oracle, i.e., we can get $f_i(x)$ by sending a query $x$ to $f_i$ in $O(1)$ time.

Let $L$ and $S$ be subsets of $[N]$ and $[n]$, respectively. (As we will see shortly, we use $L$ to
keep the numbers left undeleted in the process and $S$ to keep the functions inserted into the data
structure.) Initially, $L = [N]$ and $S = \emptyset$. For any $x \in [N]$, let $f^*_S(x) = \min_{i \in S} f_i(x)$. We want to
construct a data structure $\mathcal{H}$ that supports the following operations.

• AccessMin($\mathcal{H}$): Return $x \in L$ with minimum value $f^*_S$, i.e., $x = \arg\min_{x \in L} f^*_S(x)$.

• Insert($f_i$, $\mathcal{H}$): Insert $f_i$ to $S$.

• DeleteMin($\mathcal{H}$): Delete $x$ from $L$, where $x$ is returned from AccessMin($\mathcal{H}$).

Properties: We assume that $f_1, f_2, \ldots$ have the following properties.

• For all $i$, $f_i$ is unimodal, i.e., there is some $\gamma_i \in [N]$ such that
$f_i(1) \ge f_i(2) \ge \cdots \ge f_i(\gamma_i) \le f_i(\gamma_i + 1) \le f_i(\gamma_i + 2) \le \cdots \le f_i(N)$.
We assume that $\gamma_i$ is given along with $f_i$.

• We also assume that each $f_i$ comes along with a linear function $g_i$ where, for any $x \in [N]$,
$g_i(x) = x \cdot w_i + d_i$, for some $w_i$ and $d_i$. These linear functions have the property that $f_i(x) = f^*_S(x)$
if and only if $g_i(x) = g^*_S(x)$, where $g^*_S(x) = \min_{i \in S} g_i(x)$.

• Finally, we assume that once $x$ is deleted from $L$, $f^*_S(x)$ will never change, even after we add
more functions to $S$.

For simplicity, we also assume that $w_i \ne w_j$ for all $i \ne j$. This assumption can be removed by
taking care of the case of equal weights in the insert operation. We now show that there is a data
structure such that every operation can be done in $O(\log n)$ time.
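Before turning to the efficient design, it may help to pin down the intended semantics with a brute-force reference. The sketch below is not the paper's data structure: every query scans all undeleted points and all inserted functions. The class name `NaiveEnvelopeMin` and its method names are hypothetical.

```python
class NaiveEnvelopeMin:
    """Brute-force reference for AccessMin, Insert and DeleteMin."""

    def __init__(self, N):
        self.alive = set(range(1, N + 1))   # the set L of undeleted points
        self.funcs = []                     # the inserted functions (the set S)

    def insert(self, f):
        self.funcs.append(f)

    def envelope(self, x):
        # f*_S(x): pointwise minimum over all inserted functions.
        # Requires at least one inserted function.
        return min(f(x) for f in self.funcs)

    def access_min(self):
        # Smallest envelope value among undeleted points; ties go to smaller x.
        return min(self.alive, key=lambda x: (self.envelope(x), x))

    def delete_min(self):
        x = self.access_min()
        self.alive.remove(x)
        return x
```

Such a reference is mainly useful for testing a faster implementation against: it takes time proportional to $|L| \cdot |S|$ per query, whereas the goal stated above is $O(\log n)$ per operation.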
Data structure design: We have two data structures to maintain the information of the $f_i$'s and
$g_i$'s. First, we create a data structure $T_g$ to maintain an ordered sequence $g_{i_1}, g_{i_2}, \ldots$ such that
$w_{i_1} > w_{i_2} > \cdots$. We want to be able to insert a new function $g_i$ to $T_g$ in $O(\log n)$ time. Moreover,
for any $w$, we want to be able to find $w_{i_j}$ and $w_{i_{j+1}}$ such that $w_{i_j} > w > w_{i_{j+1}}$ in $O(\log n)$ time.

Such a $T_g$ can be implemented by a balanced binary search tree, e.g., an AVL tree.

Observe that the linear functions $g_{i_1}, g_{i_2}, \ldots$ appear in the lower envelope in order, i.e., if
$g_{i_j}(x) > g_{i_{j+1}}(x)$, then $g_{i_j}(y) > g_{i_{j+1}}(y)$ for any $y > x$. Therefore, we can use the data structure $T_g$ to
maintain the range of values in which each $g_i$ (and therefore $f_i$) is in the lower envelope. That is,
we use $T_g$ to maintain $x_1 \le y_1 < x_2 \le y_2 < \cdots$ such that $g_i(x) = g^*_S(x)$ for all $i$ and $x_i \le x \le y_i$.

Consider the value $\min_{x \in \{x_i, x_i+1, \ldots, y_i\} \cap L} f_i(x)$. Since $f_i$ is unimodal, the minimum value of
$f_i(x)$ over $\{x_i, x_i + 1, \ldots, y_i\} \cap L$ is attained at the point closest to $\gamma_i$, either from the left or from the
right. Thus, we can use two pointers $p_i$ and $q_i$ such that $x_i \le p_i \le \gamma_i \le q_i \le y_i$ to maintain the
minimum value of $f_i$ from the left and right of $\gamma_i$, i.e., the minimum value $\min_{x \in \{x_i, x_i+1, \ldots, y_i\} \cap L} f_i(x)$
is either $f_i(p_i)$ or $f_i(q_i)$. Finally, we use a binary heap $B$ to store the values $f_1(p_1), f_2(p_2), \ldots$ and
$f_1(q_1), f_2(q_2), \ldots$ so that we can search and delete the minimum among these values in $O(\log n)$
time.

More details of the implementation of each operation are as follows.

• AccessMin($\mathcal{H}$): This operation is done by returning the minimum value in $B$. This value
is $\min(f_1(p_1), f_2(p_2), \ldots, f_1(q_1), f_2(q_2), \ldots) = \min_{x \in L} f^*_S(x)$.

• Insert($f_i$, $\mathcal{H}$): First, insert $g_i$ to $T_g$, which can be done as follows. Let the current ordered
sequence be $g_{i_1}, g_{i_2}, \ldots$. In $O(\log n)$ time, we find $g_{i_j}$ and $g_{i_{j+1}}$ such that $w_{i_j} > w_i > w_{i_{j+1}}$
and insert $g_i$ between them. Moreover, we update the regions in which $g_{i_j}$, $g_i$ and $g_{i_{j+1}}$ are in the lower
envelope $g^*_S$, i.e., we get the values $y_{i_j}, x_i, y_i, x_{i_{j+1}}, y_{i_{j+1}}$ (note that
$y_{i_j} < x_i \le y_i < x_{i_{j+1}} \le y_{i_{j+1}}$).

Next, we deal with the pointers $p_i$ and $q_i$. We set $p_i = \min(\gamma_i, y_i)$ and $q_i = \max(\gamma_i, x_i)$. (The
intuition here is that we would like to set $p_i = q_i = \gamma_i$, but it is possible that $\gamma_i < x_i$ or $\gamma_i > y_i$,
which means that $\gamma_i$ is not in the region in which $g_i$ is in the lower envelope $g^*_S$.) Finally, we also
update p^ and q ij+1 : p i:j = min^.,^) and q ij+1 = max(g ij+1 ,?/ i ). Figure [3] shows an effect 
of inserting a new function. 

We note one technical detail here: It is possible that $p_i$ is already deleted from $L$. This implies
that there is another function $f_{i_{j'}}$ such that $f_{i_{j'}}(p_i) = f_i(p_i)$ (since we assume that if $p_i$ is
already deleted, then $f^*_S(p_i)$ will never change even when we add more functions to $S$). There
are two cases: $j' < j$ or $j' > j$. For the former case, we know that $f_{i_{j'}}(p_i - 1) < f_i(p_i - 1)$
since $w_{i_{j'}} > w_i$, and thus we simply do nothing ($p_i$ will never be returned by AccessMin).
For the latter case, we know that $f_{i_{j'}}(p_i - 1) > f_i(p_i - 1)$ and thus we simply set $p_i$ to $p_i - 1$.
We deal with the same case for $q_i$ similarly.

• DeleteMin($\mathcal{H}$): We delete the node with minimum value from $B$ (which is the one on top of
the heap). This deleted node corresponds to one of the values $f_1(p_1), f_2(p_2), \ldots, f_1(q_1), f_2(q_2), \ldots$.
Assume that $f_i(p_i)$ (resp. $f_i(q_i)$) is such a value. We insert a node with value $f_i(p_i - 1)$ (resp.






Figure 3: Inserting a new function. Panels: before inserting $f$; after inserting $f$ (the value of $p$ is changed).



2.3 Using the data structure for semi-matching problem 

For any right vertex $v$, we construct a data structure $H_v$ as in Section 2.2 to maintain $f_{uv}$ for all
neighbors $u$ of $v$, each of which comes along with $g_{uv}$. These functions satisfy the properties above, as shown
in Section 2.1. (We note that once $x$ is deleted, $f_{*v}(x)$ will never change since this corresponds to
adding a vertex $v^x$ to the shortest path tree with distance $f_{*v}(x)$.)

The last issue is how to find $\gamma_{uv}$, the lowest point of an edge $uv$, quickly. We now show an
algorithm that finds $\gamma_{uv}$ for every edge $uv \in E$ in time $O(|V| + |E|)$ in total. This algorithm can
be run before we start each iteration of the main algorithm (i.e., above Line 4 of Algorithm 2.1).
To derive such an algorithm, we need the following observation.

Lemma 2.6. For any $v \in V$ and $u_1, u_2 \in U$, if $w_{u_1v} > w_{u_2v}$, then $\gamma_{u_1v} \le \gamma_{u_2v}$.

Proof. Note that by Claim 2.5, $\gamma_{uv}$ is the minimum integer $i \in [\deg(v)]$ such that
$p(v^{i+1}) - p(v^i) \le w_{uv}$. Also, for any $j < \gamma_{u_1v}$, $p(v^{j+1}) - p(v^j) > w_{u_1v}$ by definition. If
$\gamma_{u_1v} > \gamma_{u_2v}$, then $p(v^{\gamma_{u_2v}+1}) - p(v^{\gamma_{u_2v}}) > w_{u_1v}$. However,
$p(v^{\gamma_{u_2v}+1}) - p(v^{\gamma_{u_2v}}) \le w_{u_2v}$. So, $w_{u_1v} < w_{u_2v}$, contradicting the assumption. □

Algorithm: The following algorithm finds $j_{uv}$ for all $uv \in E$. First, in the preprocessing step
(which is done once before we begin the main algorithm), we order the edges incident to $v$ decreasingly
by their weights, for every vertex $v \in V$. This process takes $O(\deg(v) \log(\deg(v)))$ time for each $v$. We only have
to compute this order once, so this process does not affect the overall running time.

Next, for any $v \in V$, suppose that the list is $(u_1, u_2, \ldots, u_{\deg(v)})$. Since
$w_{u_1 v} \ge w_{u_2 v} \ge \ldots \ge w_{u_{\deg(v)} v}$, it implies that
$j_{u_1 v} \le j_{u_2 v} \le \ldots \le j_{u_{\deg(v)} v}$ by Lemma 2.6. So, we first find $j_{u_1 v}$ and then $j_{u_2 v}$
and so on. This step takes $O(\deg(v))$ time for each $v \in V$ and $O(m)$ time in total. Therefore, the running
time for computing the minimum points $j_{uv}$, including the sorting step, is $O(m \log n)$.
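The monotonicity from Lemma 2.6 is what makes a single left-to-right scan per vertex sufficient. The following is a minimal sketch of that scan, not the paper's implementation: the function name `lowest_points`, the 0-based indexing, and the representation of the potential differences $p(v^{i+1}) - p(v^i)$ as a plain list `gaps` are all assumptions made for illustration.

```python
def lowest_points(gaps, weights_desc):
    """For each weight w (given in non-increasing order), return the smallest
    index i with gaps[i] < w, or len(gaps) if no such index exists.

    gaps[i] stands for p(v^{i+1}) - p(v^i) (hypothetical 0-based indexing).
    weights_desc holds the weights of the edges incident to one vertex v,
    already sorted decreasingly as in the preprocessing step.
    """
    result = []
    ptr = 0
    for w in weights_desc:
        # An index rejected for a heavier edge is also rejected for a lighter
        # one, so the pointer never has to move backwards.
        while ptr < len(gaps) and gaps[ptr] >= w:
            ptr += 1
        result.append(ptr)
    return result
```

Because the pointer only moves forward, the scan for one vertex costs time linear in the number of weights plus the number of gaps.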

3 Unweighted semi-matching 

In this section, we present an algorithm that finds the optimal semi-matching in an unweighted graph
in $O(\sqrt{n}\, m \log n)$ time.

Overview



Our algorithm consists of the following three steps. 

In the first step, we reduce the problem to the min-cost flow problem, using the same reduction
as Harvey et al. [19]. (See Figure 4.) The details are provided in Section 3.1. We note that
the flow is optimal if and only if there is no cost-reducing path (to be defined later). We start
with an arbitrary semi-matching and use this reduction to get a corresponding flow. The goal is to
eliminate all the cost-reducing paths.

The second step is a divide-and-conquer algorithm used to eliminate all the cost-reducing paths.
We call this algorithm CancelAll (cf. Algorithm 3.1). The main idea here is to divide the
graph into two subgraphs so that eliminating cost-reducing paths "inside" each subgraph does not
introduce any new cost-reducing paths going through the other. This dividing step needs to be
done carefully. We treat this in Section 3.2.

Finally, in the last component of the algorithm we deal with eliminating cost-reducing paths
between two sets of vertices quickly. Naively, one can do this using any unit-capacity max-flow
algorithm, but this does not give an improvement on the running time. To get a faster algorithm, we
observe that the structure of the graph is similar to a unit network, where every vertex has in-degree
or out-degree one. Thus, we get the same performance guarantee as Dinitz's algorithm [8].⁴
Details of this part can be found in Section 3.3.

After presenting the algorithm in the next three sections, we analyze the running time in
Section 3.4. We note that this algorithm also works for a more general cost function (discussed
in Section 3.5). We also observe an $O(n^{5/2} \log n)$-time algorithm that arises directly from the
reduction of the weighted case (discussed in Appendix B). This already gives an improvement over
the previous results, but the result presented here improves the running time further.

3.1 Reduction to min-cost flow and optimality characterization (revisited) 

In this section, we review the characterization of the optimality of a semi-matching in the min-
cost flow framework. We use the reduction given in [19]. Given a bipartite graph $G = (U \cup V, E)$,
we construct a directed graph $N$ as follows. Let $\Delta$ denote the maximum degree of the vertices in
$V$. First, add a set of vertices, called cost centers, $C = \{c_1, c_2, \ldots, c_\Delta\}$ and connect each $v \in V$ to
$c_i$ with an edge of capacity 1 and cost $i$, for all $1 \le i \le \deg(v)$. Second, add $s$ and $t$ as a source and a
sink vertex. For each vertex in $U$, add an edge from $s$ to it with zero cost and unit capacity. For
each cost center $c_i$, add an edge to $t$ with zero cost and infinite capacity. Finally, direct each edge
$e \in E$ from $U$ to $V$ with capacity 1 and cost 0.

Observe that the new graph $N$ contains $O(n)$ vertices and $O(m)$ edges. It can be seen that
any semi-matching in $G$ corresponds to a max flow in $N$. (See the example in Figure 4.) Moreover,
Harvey et al. [19] prove that an optimal semi-matching in $G$ corresponds to a min-cost flow in $N$;
in other words, the reduction described above is correct. Our algorithm is based on the observation that
the largest cost is $O(|U|)$. This allows one to use the cost-scaling framework to solve the problem.
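The construction of $N$ can be written down directly. The sketch below is illustrative only: the function names, the tuple encoding `("c", i)` for cost center $c_i$, and the reserved node names `"s"` and `"t"` are assumptions, and the vertices of $G$ are assumed not to clash with those names.

```python
def build_network(U, V, edges):
    """Return (nodes, arcs) of the network N; each arc is
    (tail, head, capacity, cost). edges is a list of pairs (u, v)."""
    deg = {v: 0 for v in V}
    for _, v in edges:
        deg[v] += 1
    delta = max(deg.values(), default=0)          # maximum degree in V
    centers = [("c", i) for i in range(1, delta + 1)]

    arcs = []
    for u in U:                                   # source arcs, unit capacity
        arcs.append(("s", u, 1, 0))
    for u, v in edges:                            # original edges, U -> V
        arcs.append((u, v, 1, 0))
    for v in V:                                   # v -> c_i with cost i
        for i in range(1, deg[v] + 1):
            arcs.append((v, ("c", i), 1, i))
    for c in centers:                             # cost centers -> sink
        arcs.append((c, "t", float("inf"), 0))

    nodes = ["s", "t"] + list(U) + list(V) + centers
    return nodes, arcs


def semi_matching_cost(assignment):
    """Cost of the cheapest flow matching a semi-matching {u: v}: a vertex v
    with load d sends one unit to each of c_1, ..., c_d, costing 1 + ... + d."""
    load = {}
    for v in assignment.values():
        load[v] = load.get(v, 0) + 1
    return sum(d * (d + 1) // 2 for d in load.values())
```

The arc count is $|U| + 2|E| + \Delta$, which matches the $O(m)$ bound stated above when $G$ has no isolated vertices.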

Now, we review an optimality characterization of the min-cost flow. We need to define a cost-
reducing path first. Let $R_f$ denote the residual graph of $N$ with respect to a flow $f$. We call any
path $p$ from a cost center $c_i$ to $c_j$ in $R_f$ an admissible path and call $p$ a cost-reducing path if $i > j$.

4 The algorithm is also known as "Dinic's algorithm". See [H] for details. 
Figure 4: Reduction to the min-cost flow problem. Each edge is labelled with (cost, capacity)
constraint. Thick edges either are matching edges or contain the flow.

Cost-reducing paths correspond one-to-one to negative-cost cycles, which yields the condition
for the minimality of $f$. Harvey et al. [19] proved the following.

Lemma 3.1 ([19]). A flow $f$ is a min-cost flow in $N$ if and only if there is no cost-reducing path
in $R_f$.

Proof. Note that $f$ is a min-cost flow if and only if there is no negative cycle in $R_f$. To prove the
"only if" part, assume that there is a cost-reducing path from $c_i$ to $c_j$. We consider the shortest
one, i.e., no cost center is contained in the path except the first and the last vertices. The edges that
affect the cost of this path are only the first and the last ones because only edges incident to cost
centers have cost. The costs of the first and the last edge are $-i$ and $j$, respectively. Connecting $c_i$ and $c_j$
with $t$ results in a cycle of cost $j - i < 0$.

For the "if" part, assume that there is a negative-cost cycle in $R_f$. Consider the shortest cycle,
which contains only two cost centers, say $c_i$ and $c_j$ where $i > j$. This cycle contains an admissible
path from $c_i$ to $c_j$. □

Given a max-flow $f$ and a cost-reducing path $P$, one can find a flow $f'$ with lower cost by
augmenting $f$ along $P$ with a unit flow. This is later called path cancelling. We are now ready to
explain our algorithm.
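To make path cancelling concrete, here is a deliberately naive baseline that cancels one cost-reducing path at a time. It is not the divide-and-conquer algorithm of this paper. It works directly on a semi-matching rather than on the flow, under the assumption that a vertex $v$ with load $d$ uses exactly $c_1, \ldots, c_d$; under that assumption a cost-reducing path exists exactly when some alternating path leads from a vertex of load $a$ to a vertex of load at most $a - 2$. All names are hypothetical.

```python
from collections import deque


def cancel_one_path(adj, match):
    """Find and cancel one cost-reducing path; return True if one was found.
    adj: {u: [v, ...]} for u in U; match: {u: v}, modified in place."""
    load, assigned = {}, {}
    for u, v in match.items():
        load[v] = load.get(v, 0) + 1
        assigned.setdefault(v, []).append(u)
    for start in sorted(load, key=load.get, reverse=True):
        # parent[b] = (previous V-vertex, U-vertex to be moved onto b)
        parent = {start: None}
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for u in assigned.get(a, []):
                for b in adj[u]:
                    if b in parent:
                        continue
                    parent[b] = (a, u)
                    if load.get(b, 0) + 1 < load[start]:
                        # Shift every U-vertex on the path one step forward.
                        while parent[b] is not None:
                            prev, moved = parent[b]
                            match[moved] = b
                            b = prev
                        return True
                    queue.append(b)
    return False


def cancel_all_naive(adj, match):
    """Repeat until no cost-reducing path remains (Lemma 3.1 then applies)."""
    while cancel_one_path(adj, match):
        pass
    return match
```

Each successful call lowers the total cost by at least one, so the loop terminates; the point of Sections 3.2 and 3.3 is to cancel many paths per pass instead.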

3.2 The divide-and-conquer algorithm 

Our algorithm takes a bipartite graph $G = (U \cup V, E')$ and outputs the optimal semi-matching. It
starts by transforming $G$ into a graph $N$ as described in the previous section. Since the source $s$
and the sink $t$ are always clear from the context, the graph $N$ can be seen as a tripartite graph with
vertices $U \cup V \cup C$; later on, we denote $N = (U \cup V \cup C, E)$. The algorithm proceeds by finding
an arbitrary max-flow $f$ from $s$ to $t$ in $N$ which corresponds to a semi-matching in $G$. This can be
done in linear time since the flow is equivalent to any semi-matching in $G$.

To find the min-cost flow in $N$, the algorithm uses a subroutine called CancelAll (cf. Algo-
rithm 3.1) to cancel all cost-reducing paths in $f$. Lemma 3.1 ensures that the final flow is optimal.

CancelAll works by dividing $C$ and solving the problem recursively. Given a set of cost centers
$C$, the algorithm divides $C$ into roughly equal-size subsets $C_1$ and $C_2$ such that, for any $c_i \in C_1$ and
$c_j \in C_2$, $i < j$. This guarantees that there is no cost-reducing path from $C_1$ to $C_2$. Then it cancels
all cost-reducing paths from $C_2$ to $C_1$ by calling the Cancel algorithm (described in Section 3.3).

It is left to cancel the cost-reducing paths "inside" each of $C_1$ and $C_2$. This is done by parti-
tioning the vertices of $N$ (except $s$ and $t$) and forming two subgraphs $N_1$ and $N_2$. Then solve the
problem separately on each of them. In more detail, we partition the graph $N$ by letting $N_2$ be
the subgraph induced by the vertices reachable from $C_2$ in the residual graph and $N_1$ be the subgraph
induced by the remaining vertices. (Note that both graphs have $s$ and $t$.) For example, in Figure 4, $v_1$ is
reachable from $c_3$ by the path $c_3, v_2, u_2, v_1$ in the residual graph.

Algorithm 3.1 CancelAll($N = (U \cup V \cup C, E)$)
1: if $|C| = 1$ then halt endif
2: Divide $C$ into $C_1$ and $C_2$ of roughly equal size.
3: Cancel($N$, $C_2$, $C_1$). {Cancel all cost-reducing paths from $C_2$ to $C_1$.}
4: Divide $N$ into $N_1$ and $N_2$ where $N_2$ is "reachable" from $C_2$ and $N_1$ is the rest.
5: Recursively solve CancelAll($N_1$) and CancelAll($N_2$).

Lemma 3.2. CancelAll($N$) (cf. Algorithm 3.1) cancels all cost-reducing paths in $N$.

Proof. Recall that all cost-reducing paths from $C_2$ to $C_1$ are cancelled in line 3. Let $S$ denote the
set of vertices reachable from $C_2$.

Claim 3.3. After line 3, no admissible paths between two cost centers in $C_1$ intersect $S$.

Proof. Assume, for the sake of contradiction, that there exists an admissible path from $x$ to $y$,
where $x, y \in C_1$, that contains a vertex $s \in S$. Since $s$ is reachable from some vertex $z \in C_2$, there
must exist an admissible path from $z$ to $y$; this leads to a contradiction. □

This claim implies that, in our dividing step, all cost-reducing paths between pairs of cost
centers in $C_1$ remain entirely in $N_1$. Furthermore, vertices in any cost-reducing path between pairs
of cost centers in $C_2$ must be reachable from $C_2$; thus, they must be inside $S$. Therefore, after
the recursive calls, no cost-reducing paths between pairs of cost centers in the same subproblem
$C_i$ are left. The lemma follows if we can show that in these processes we do not introduce more
cost-reducing paths from $C_2$ to $C_1$. To see this, note that all edges between $N_1$ and $N_2$ remain
untouched in the recursive calls. Moreover, these edges are directed from $N_1$ to $N_2$, because of the
maximality of $S$. Therefore there is no admissible path from $C_2$ to $C_1$. □

3.3 Cancelling paths from $C_2$ to $C_1$

In this section we describe an algorithm that cancels all admissible paths from $C_2$ to $C_1$ in $R_f$,
which can be done by finding a max flow from $C_2$ to $C_1$. To simplify the presentation, we assume
that there is a super-source $s$ and a super-sink $t$ connecting to the vertices in $C_2$ and in $C_1$, respectively.

To find a maximum flow, observe that $N$ is unit-capacity and every vertex of $U$ has indegree
1 in $R_f$. By exploiting these properties, we show that Dinitz's blocking flow algorithm [7] can
find a maximum flow in $O(|E| \sqrt{|U|})$ time. The algorithm repeatedly augments flows
through the shortest augmenting paths (see Appendix C).

Lemma 3.4. Let $d_i$ be the length of the shortest $s$-$t$ path in the residual graph at the $i$-th iteration.
For all $i$, $d_{i+1} > d_i$.

The lemma can be used to show that Dinitz's algorithm terminates after $n$ rounds of the
blocking flow step, where $n$ is the number of vertices: after the $n$-th round, the distance
from the source to the sink is more than $n$, which means that there is no augmenting path from $s$ to $t$
in the residual graph. The number of rounds can be improved for certain classes of problems.
Even and Tarjan [10] and Karzanov [22] showed that in unit capacity networks, Dinitz's algorithm
terminates after $\min(n^{2/3}, m^{1/2})$ rounds, where $m$ is the number of edges. Also, in unit networks,
where every vertex has in-degree one or out-degree one, Dinitz's algorithm terminates in $O(\sqrt{n})$
rounds (see, e.g., Tarjan's book [16]). Since the graph $N$ we are considering is very similar to unit
networks, we are able to show that Dinitz's algorithm also terminates in $O(\sqrt{n})$ rounds in our case.

For any flow $f$, a residual flow $f'$ is a flow in the residual graph $R_f$ of $f$. If $f'$ is maximum in $R_f$,
then $f + f'$ is maximum in the original graph. The following lemma relates the amount of the maximum
residual flow to the shortest distance from $s$ to $t$ in our case. The proof is a modification of
Theorem 8.8 in [46].

Lemma 3.5. If the shortest $s$-$t$ distance in the residual graph is $d > 4$, the amount of the
maximum residual flow is $O(|U|/d)$.

Proof. A maximum residual flow in a unit capacity network can be decomposed into a set $\mathcal{P}$ of
edge-disjoint paths where the number of paths equals the flow value. Each of these paths is
of length at least $d$. Clearly, each path contains the source, the sink, and exactly two cost centers.
Now consider any path $P \in \mathcal{P}$ of length $l$. It contains $l - 3$ vertices from $U \cup V$. Since the original
graph is a bipartite graph, at least $\lfloor (l - 3)/2 \rfloor \ge \lfloor (d - 3)/2 \rfloor \ge (d - 4)/2$ vertices are from $U$. Note
that each path in $\mathcal{P}$ contains a disjoint set of vertices in $U$, since a vertex in $U$ has in-degree one.
Therefore, we conclude that there are at most $2|U|/(d - 4)$ paths in $\mathcal{P}$. The lemma follows since
each path carries one unit of flow. □

From these two lemmas, we have the main lemma for this section.

Lemma 3.6. Cancel terminates in $O(|E| \sqrt{|U|})$ time.

Proof. Since each iteration can be done in $O(|E|)$ time, it is enough to prove that the algorithm
terminates in $O(\sqrt{|U|})$ rounds. The previous lemma implies that the amount of the maximum
residual flow after the $O(\sqrt{|U|})$-th round is $O(\sqrt{|U|})$ units. The lemma thus follows because after
that the algorithm augments at least one unit of flow in each round. □
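The counting behind this proof can be spelled out as follows. The explicit threshold $k$ is chosen here only for illustration and does not appear in the original argument.

```latex
% Sketch: why O(sqrt(|U|)) rounds suffice (k is an illustrative threshold).
\begin{align*}
  &\text{Lemma 3.4 and } d_1 \ge 1 \text{ give } d_i \ge i \text{ for all } i. \\
  &\text{Let } k = \lceil \sqrt{|U|}\, \rceil + 5.
     \text{ Then } d_k - 4 \ge \lceil \sqrt{|U|}\, \rceil + 1 > \sqrt{|U|}. \\
  &\text{By the proof of Lemma 3.5, the maximum residual flow at round } k
     \text{ is at most } \frac{2|U|}{d_k - 4} < 2\sqrt{|U|}. \\
  &\text{Each later round augments at least one unit, so the number of rounds is at most }
     k + 2\sqrt{|U|} = O(\sqrt{|U|}).
\end{align*}
```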

3.4 Running time 

The running time of the algorithm is dominated by the running time of CancelAll, which can
be analyzed as follows. Let $T(n, n', m, k)$ denote the running time of the algorithm when $|U| = n$,
$|V| = n'$, $|E| = m$, and $|C| = k$. For simplicity, assume that $k$ is a power of two. By Lemma 3.6,
Cancel runs in $O(|E| \sqrt{|U|})$ time. Therefore,

$T(n, n', m, k) \le c \cdot m \sqrt{n} + T(n_1, n'_1, m_1, k/2) + T(n_2, n'_2, m_2, k/2)$,

for some constant $c$, where $n_i$, $n'_i$, and $m_i$ denote the numbers of vertices and edges in $N_i$, respectively.
Recall that each edge participates in at most one of the subproblems; thus, $m_1 + m_2 \le m$. Observe
that the number of cost centers always decreases by a factor of two. Thus, the recurrence solves
to $O(\sqrt{n}\, m \log k)$. Since $k = O(|U|)$, the running time is $O(\sqrt{n}\, m \log n)$ as claimed. Furthermore,
the algorithm can work with a more general cost function with the same running time, as shown in the
next section.
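One way to see the solution of the recurrence is to sum the work level by level in the recursion tree. In the sketch below, the superscript $(j)$ marks quantities of the subproblems at depth $j$; this notation is introduced here for illustration only.

```latex
% Recursion tree for T(n, n', m, k); depth j holds at most 2^j subproblems
% whose edge sets are disjoint and whose U-sides have size at most n.
\begin{align*}
  \text{cost of depth } j
    &\le \sum_{i} c \cdot m_i^{(j)} \sqrt{n_i^{(j)}}
     \le c \sqrt{n} \sum_{i} m_i^{(j)}
     \le c \cdot m \sqrt{n}, \\
  T(n, n', m, k)
    &\le \sum_{j=0}^{\log_2 k - 1} c \cdot m \sqrt{n}
     = O(\sqrt{n}\, m \log k).
\end{align*}
```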
3.5 Generalizations of an unweighted algorithm

The problem can be viewed in a slightly more general version. In Harvey et al. [19], the cost
functions for all vertices $v \in V$ are the same. We relax this condition, allowing a different function
for each vertex as long as each function is convex. More precisely, for each $v \in V$, let $f_v : \mathbb{Z}_+ \to \mathbb{R}$ be
a convex function, i.e., for any $i$, $f_v(i + 1) - f_v(i) \ge f_v(i) - f_v(i - 1)$. The cost for a matching $M$ on
vertex $v$ is $f_v(\deg_M(v))$. With this convex cost function, a transformation similar to the one described
in Section 3.1 can still be done. However, the number of different values of $f_v$ is now $O(|E|)$. So,
the size of the set of cost centers $C$ is now upper bounded by $O(|E|)$, not $O(|U|)$. Therefore, the
running time of our algorithm becomes $O(|E| \sqrt{|U|} \log |C|) = O(|E| \sqrt{|U|} \log |E|) = O(\sqrt{n}\, m \log n)$
(since $|E| \le n^2$), which is the same as before.
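The text does not spell out the modified transformation. One natural reading, stated here as an assumption, is that the $i$-th unit-capacity edge leaving $v$ costs the marginal value $f_v(i) - f_v(i-1)$, with one cost center per distinct marginal value. Convexity makes these marginals non-decreasing, so a min-cost flow uses them in order. The helper names below are hypothetical, and `f(0)` is assumed to be defined.

```python
def marginal_costs(f, deg):
    """Costs of the deg unit-capacity edges leaving a vertex with cost
    function f: the i-th edge costs f(i) - f(i - 1)."""
    return [f(i) - f(i - 1) for i in range(1, deg + 1)]


def is_convex(f, deg):
    """Check f(i + 1) - f(i) >= f(i) - f(i - 1) for 1 <= i < deg."""
    m = marginal_costs(f, deg)
    return all(m[i] <= m[i + 1] for i in range(len(m) - 1))
```

With $f(d) = d(d+1)/2$, the marginals are $1, 2, \ldots, \deg(v)$, which recovers the edge costs of Section 3.1.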

4 Extension to Balanced Edge Cover problem 

Recall that the problem is the following. 
Input: A simple undirected graph G = (V,E). 

Task: Find an edge cover $F$ minimizing $c(F) = \sum_{v \in V} \deg_F(v)$, where $\deg_F(v) = |\{vu \in F\}|$ and
we say that $F$ is an edge cover if $\deg_F(v) \ge 1$ for all $v \in V$.⁵

We call the solution of such problem an optimal balanced edge cover. Observe that any minimal 
edge cover - including any optimal balanced edge cover - induces a star forest; i.e., every connected 
component has at most one vertex of degree greater than one (we call such vertices centers) and 
the rest have degree exactly one. For any optimal balanced edge cover $F$, we call any set of vertices
$C$ an extended set of centers of $F$ if $C$ contains all centers of $F$ and exactly one vertex from each
of the connected components that have no center. (To be precise, $C$ is an extended set of centers if $C$ contains all
centers of $F$ and each connected component in the subgraph induced by $F$ contains exactly one
vertex in $C$.)

To solve the balanced edge cover problem using semi-matching algorithm, we first make a further 
observation that if an extended set of centers is given, then optimal balanced edge cover can be 
found by simply solving the unweighted semi-matching problem. 

Lemma 4.1. Let C be an extended set of centers of some optimal balanced edge cover F . Let 
G' = ((V \ C) U C, E') be a bipartite graph where E' is the set of edges between V\C and C in G. 
Then, an optimal semi-matching in G' (where we allow vertices in C to be connected more than 
once) is an optimal balanced edge cover in G. 

Proof. Let $M$ be any optimal semi-matching in $G'$. First, note that $F$ is also a semi-matching in
$G'$. Thus, the cost of $M$ is at most the cost of $F$. It is left to show that $M$ is an edge cover. In
other words, it is left to prove that every vertex in $C$ is covered by $M$.

Assume for the sake of contradiction that there is a vertex $v \in C$ that is not covered by $M$.
We show that there exists a cost-reducing path of $M$ starting from $v$ as follows. Starting from
$v_0 = v$, let $v_1$ be any vertex adjacent to $v_0$ in $F$ and $v_2$ be a (unique) vertex adjacent to $v_1$ in
$M$. If $\deg_M(v_2) > 1$, then we stop the process. Otherwise, repeat the process by finding a vertex
$v_3$ adjacent to $v_2$ in $F$ and a vertex $v_4$ adjacent to $v_3$ in $M$. Observe that all vertices found during the
process are unique since every vertex in $V \setminus C$ has degree exactly one in $F$, every vertex found in
the process, except $v_0$, has degree exactly one in $M$, and there is no edge in $F$ between two vertices
in $C$. Therefore, the process above must stop. Moreover, the path obtained after the process stops
is a cost-reducing path, contradicting the assumption that $M$ is an optimal semi-matching. □

5 We note that the original definition of the balanced edge cover problem has a function $f : \mathbb{Z}_+ \to \mathbb{R}_+$ as an
input [16]. However, it is shown in [16] that the optimal balanced edge cover can be determined independently of the
function $f$ as long as $f$ is strictly monotonic. In other words, the problem is equivalent to the one we define here.

It is left to find any extended set of centers. We do so by defining levels of vertices based on some
edge cover.

Definition 4.2. For any edge cover $F$, define the levelling of vertices in $F$, denoted by $L_F$, as
follows.

First, let all center vertices (i.e., all vertices with degree more than one in $F$) be on level 1. For
$i = 1, 2, \ldots$, we construct level $i + 1$ by looking at any vertex $v$ not yet assigned to any level. If $i$
is odd and $v$ shares an edge in $F$ with a vertex on level $i$, then we add $v$ to level $i + 1$. Otherwise,
we add $v$ to level $i + 1$ if $i$ is even, $v$ shares an edge not in $F$ with a vertex on level $i$ and $v$ does
not share an edge in $F$ with any vertex on level $i + 1$. (For example, after we put vertices of degree
more than one in the first level, we put their leaves on level two. Then, we put all vertices adjacent
to these leaves by non-covering edges on level three. But, if we see a single edge having both end
vertices on level three, we put one end on level four, and so on.)

Note that when the process finishes, there might be some vertices that are not assigned to any
level. □

An optimal balanced edge cover can be found by the following algorithm. 

Find-Center Algorithm: First, find a minimum cardinality edge cover $F$. Then, find $L_F$ in
a breadth-first manner. Let $M$ be an optimal semi-matching of a bipartite graph where the left
vertices are the even-level vertices and the right vertices are the odd-level vertices. Output $M$ and the edges in $F$
between vertices with no level.

Now, we show the running time and the correctness of Find-Center algorithm. 

Running time analysis: $F$ can be found by simply adding uncovered vertices to a maximum
cardinality matching [13, 35]. The maximum cardinality matching can be found
by the Micali-Vazirani algorithm [34] in $O(\sqrt{n}\, m)$ time, or in $O(n^\omega)$ time by Harvey's algorithm [18], where $\omega$ is
the exponent of the time for computing matrix multiplication. However, since the running time of the semi-matching
algorithm is $O(\sqrt{n}\, m \log n)$, it suffices to use the first one. Thus, $F$ can be found in $O(\sqrt{n}\, m)$ time.
Moreover, finding $L_F$ can be done in a breadth-first manner, which takes $O(n + |F| + |M|) = O(n)$
time. Therefore, the time for the reduction from the balanced edge cover problem to the semi-matching problem is
$O(\sqrt{n}\, m)$, implying a total running time of $O(\sqrt{n}\, m \log n)$.

Correctness: 

The proof of correctness uses an algorithm BEC1 proposed in [16]. This algorithm starts from any
minimum edge cover and keeps augmenting along a cost-reducing path until no such path
exists. Here a cost-reducing path with respect to an edge cover $F$ is a path starting from any center
vertex $u$, following any edge in $F$ and then an edge not in $F$. The path keeps using edges in $F$ and
edges not in $F$ alternately until it finally uses an edge not in $F$ and ends at a vertex $v$ such that
$\deg_F(v) \le \deg_F(u) - 2$. (See [16] for the formal definition.) It is shown that BEC1 returns an
optimal balanced edge cover.

Lemma 4.3. Let C be the set returned from the Find-Center algorithm. Then C is an extended 
set of centers of some optimal balanced edge cover F* . In other words, there exists F* such that all 
of its centers are in C and each connected component (in the subgraph induced by F*) has exactly 
one vertex in C . 

Proof. Let $F$ be the minimum cardinality edge cover found by the Find-Center algorithm. Con-
sider a variation of the BEC1 algorithm where we augment along a shortest cost-reducing path. We
claim that we can always augment along the shortest cost-reducing path in such a way that the parity
of vertices' levels never changes. To be precise, we construct a sequence of minimum cardinality edge
covers $F = F_1, F_2, \ldots$ where we get $F_{i+1}$ from $F_i$ by augmenting along some shortest cost-reducing
path. By the following process, we guarantee that if any vertex is on an odd (even) level in $L_F$
then it is on an odd (even) level in $L_{F_i}$. Moreover, if a vertex belongs to no level in $L_F$ then it has
no level in $L_{F_i}$.

Suppose we are at $F_i$ and our guarantee is maintained so far. Let $P$ be any shortest cost-reducing
path on $F_i$. If there is no such $P$, then we have found $F^*$. Otherwise, we consider two cases.

Case 1 If $P$ contains only vertices on levels 1 and 2: This is equivalent to reconnecting vertices on
level 2 to vertices on level 1. The level of every vertex is the same in $L_{F_i}$ and $L_{F_{i+1}}$. Thus, the
guarantee is maintained.

Case 2 Otherwise: Let $P = v_0 v_1 v_2 \cdots v_k$. Recall that $k$ is even and note that all of $v_0, v_1, \ldots, v_{k-1}$
must be on levels 1 and 2 (alternately); otherwise, we can stop at the first vertex that we visit
on another level and obtain a shorter cost-reducing path. Now, let us augment from $v_0$ until we
reach $v_{k-2}$. At this point, $v_{k-2}$ must have degree at least three (after augmentation) because
it is on level 1 (which means that it has degree more than one in $F_i$) and just receives one
more edge from the augmentation. If $v_k$ is on level 3, then we are done as it will be on level 1 in
$L_{F_{i+1}}$ and all vertices in its subtree will be 2 levels higher. If not, then $v_k$ must be on level 4.
Let $a$ be a vertex adjacent to $v_k$ by an edge in $F_i$ (which is on level 3) and let $b$ be a vertex
on level 2 adjacent to $a$ (by an edge not in $F_i$). There are two subcases.

Case 2.1 When $v_{k-1} = b$: In this case, we use the path $v_1 v_2 \cdots v_{k-1} a$ instead.

Case 2.2 When $v_{k-1} \ne b$: In this case, we get an edge cover with cardinality smaller than
$|F_i| = |F|$ by deleting three edges in $F_i$ incident to $b$, $v_{k-1}$ and $v_k$ and adding $ab$ and $v_{k-1} v_k$.
(Note that for the case that $b$ is covered by an edge incident to $v_{k-2}$, we use the fact that
$v_{k-2}$ has degree at least 3, as noted earlier.) So, this case is impossible as it contradicts the
fact that $F$ is a minimum cardinality edge cover. □
APPENDIX

A Edmonds-Karp-Tomizawa algorithm for weighted bipartite matching

In this section, we briefly explain the Edmonds-Karp-Tomizawa (EKT) algorithm. The algorithm starts
with an empty matching $M$ and iteratively augments (i.e., increases the size of) $M$. The matching
in each iteration is maintained so that it is extreme; i.e., it has highest weight among matchings of
the same cardinality. The augmenting procedure is as follows. Let $M$ be the matching maintained
so far. Let $D_M$ be the directed graph obtained from $G$ by orienting each edge $e$ in $M$ from $V$ to $U$
with length $\ell_e = -w_e$ and orienting each edge $e$ not in $M$ from $U$ to $V$ with length $\ell_e = w_e$. Let
$U_M$ (respectively, $V_M$) be the set of vertices in $U$ (respectively, $V$) not covered by $M$. If $|M| < |U|$,
then there is a $U_M$-$V_M$ path. Find a shortest such path, say $P$, and augment $M$ along $P$; i.e., set
$M = M \triangle P$. Repeat with the new value of $M$ until $|M| = |U|$.

The bottleneck of this algorithm is the shortest path algorithm. Although $D_M$ has negative edge
lengths, one can find a potential and apply Dijkstra's algorithm on the graph $D_M$ with non-negative
reduced costs. The potential and reduced cost are defined as follows.

Definition A.1. A function $p : U \cup V \to \mathbb{R}$ is a potential if, for every edge $uv$ in the residual graph
$D_M$, $\tilde{\ell}_{uv} = \ell_{uv} + p(u) - p(v)$ is non-negative. We call $\tilde{\ell}$ a reduced cost with respect to a potential $p$.

The key idea of using a potential is that a shortest path from $u$ to $v$ with respect to a reduced
cost $\tilde{\ell}$ is also a shortest path with respect to $\ell$. We omit details here (see, e.g., [38, Chapter 7 and
Section 17.2]), but note that we can use a distance function found in the last iteration of the
algorithm as a potential, as in Algorithm 2.1.

Dijkstra's algorithm. 

We now explain Dijkstra's algorithm on the graph $D_M$ with non-negative edge weights defined by $\tilde{\ell}$.
Our presentation is slightly different from the standard one but will be easy to modify later. The
algorithm keeps a subset $X$ of $U \cup V$, called the set of undiscovered vertices, and a function
$d : U \cup V \to \mathbb{R}_+$ (the tentative distance). Start with $X = U \cup V$ and set $d(u) = 0$ for all $u \in U_M$
and $d(v) = \infty$ for all vertices $v \notin U_M$. Apply the following iteratively:
1: Find $u \in X$ minimizing $d(u)$ over $u \in X$. Set $X = X \setminus \{u\}$.
2: For each neighbor $v$ of $u$ in $D_M$, "relax" $uv$: set $d(v) \leftarrow \min\{d(v), d(u) + \tilde{\ell}_{uv}\}$.

The running time of Dijkstra's algorithm depends on the implementation. One implementation
uses a Fibonacci heap. Each vertex $v \in U \cup V$ is kept in the heap with key $d(v)$. Finding
and extracting a vertex of minimum tentative distance can be done in an amortized time bound of
$O(\log |U \cup V|)$ by "extract-min", and relaxing an edge can be done in an amortized time bound of
$O(1)$ by "decrease-key".

Consider the running time caused by finding a shortest path. Let $n = |U \cup V|$ and $m = |E|$.
Then we have to call insertion $O(n)$ times, decrease-key $O(m)$ times, and extract-min $O(n)$ times.
Thus, the overall running time is $O(m + n \log n)$.
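The use of a potential can be illustrated with a small single-source variant. This is a sketch, not the EKT implementation: it uses Python's binary heap `heapq` with lazy deletion in place of a Fibonacci heap (so the bound becomes $O(m \log n)$), it starts from one source instead of the whole set $U_M$, and the function and parameter names are hypothetical.

```python
import heapq


def dijkstra_reduced(graph, length, potential, source):
    """Shortest-path distances from source with respect to `length`, computed
    by running Dijkstra on the reduced costs l(u,v) + p(u) - p(v).

    graph: {u: [v, ...]} with every vertex as a key; length: {(u, v): l};
    potential: {x: p(x)} making all reduced costs non-negative.
    """
    reduced_dist = {x: float("inf") for x in graph}
    reduced_dist[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:            # stale entry left by a lazy "decrease-key"
            continue
        done.add(u)
        for v in graph[u]:
            r = length[(u, v)] + potential[u] - potential[v]
            if r < 0:
                raise ValueError("potential does not give non-negative reduced costs")
            if d + r < reduced_dist[v]:
                reduced_dist[v] = d + r
                heapq.heappush(heap, (reduced_dist[v], v))
    # A path from source to v has reduced length = length + p(source) - p(v).
    return {v: reduced_dist[v] - potential[source] + potential[v] for v in graph}
```

The final line undoes the reduction: along any path the potentials telescope, so only the endpoints' potentials remain.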
B Observation: $O(n^3)$ and $O(n^{5/2} \log(nW))$ time algorithms



Recall the reduction from the weighted semi-matching problem to the weighted bipartite match-
ing problem, or equivalently, the assignment problem. The reduction was shown in [5, 20, 17]. We
include it here for completeness. Given a bipartite graph $G = (U \cup V, E)$ with edge weight $w$, an
instance of the semi-matching problem, we construct a bipartite graph $\hat{G} = (U \cup \hat{V}, \hat{E})$ with weight
$\hat{w}$, an instance of the weighted bipartite matching problem, as follows. For every vertex $v \in V$ of
degree $\deg(v)$, we create exploded vertices $v^1, v^2, \ldots, v^{\deg(v)}$ in $\hat{V}$ and let $\hat{V}_v$ denote the set of such
vertices. For each edge $uv$ in $E$ of weight $w_{uv}$, we also create $\deg(v)$ edges $uv^1, uv^2, \ldots, uv^{\deg(v)}$,
with associated weights $w_{uv}, 2 \cdot w_{uv}, \ldots, \deg(v) \cdot w_{uv}$, respectively. It is easy to verify that finding
an optimal semi-matching in $G$ is equivalent to finding a minimum matching in $\hat{G}$. Figure 1(a) shows
an example of this reduction.

The construction yields a graph $\hat{G}$ with $O(m)$ vertices and $O(nm)$ edges. Applying any existing
algorithm for weighted bipartite matching directly is not enough to get an improvement. However,
we observe that the reduction can be done in $O(n^2 \log n)$ time, and we can apply the result of Kao
et al. [21] to reduce the number of participating edges to $O(n^3)$. Thus, Gabow and Tarjan's
scaling algorithm [12] gives us the following result.

Observation B.1. If all edges have non-negative integer weights bounded by $W$, then there is an
algorithm for the weighted semi-matching problem with a running time of $O(n^{5/2} \log(nW))$.

This result immediately gives an $O(n^{5/2} \log n)$ time algorithm for the unweighted case (i.e.,
$W = 1$). Hence, we already have an improvement upon the previous $O(nm)$ time algorithm for the
case of dense graphs.

Now, we give an explanation of the observation. If we reduce the problem normally (as in
Section 2) to get $\hat{G}$, then the number of edges in $\hat{G}$ and the running time will be $O(nm)$. However,
since the size of any matching in the graph $\hat{G}$ is at most $|U|$, it suffices to consider only the smallest
$|U|$ edges in $\hat{G}$ incident to each vertex in $U$. Therefore, we may assume that $\hat{G}$ has $O(n^2)$ edges.
(The same observation is also used in [21].)

More precisely, let $E_u$ be the set of edges incident to $u$ in $\hat{G}$, and $R$ be a set of $|U|$ smallest edges
of $E_u$. If the maximum matching of minimum weight, say $M$, contains an edge $e \in E_u \setminus R$,
then $R \cup \{e\}$ has $|U| + 1$ edges, implying that there is an edge $e' \in R$ incident to a vertex $v \in \hat{V}$
not matched by $M$. Thus, we can replace $e$ with $e'$, which results in a matching of smaller weight.
Therefore, we need to keep only $|U|^2$ edges in our reduction.

Moreover, we can also reduce the time for the reduction to $O(n^2 \log n)$. The faster reduction
is as follows. For each vertex $u \in U$, we first add all edges incident to $u$ to a binary heap
$H$ with additional information $i = 1$, say $(e, i)$. Then we iteratively extract the minimum $(e = uv, i)$
from $H$, create an edge $uv^i$ in $\hat{E}$ with weight $i \cdot w(e)$, and insert $(uv^{i+1}, i + 1)$ back to $H$. We
repeat the process until $u$ has $|U|$ incident edges in $\hat{E}$. The pseudocode of the reduction is given
in Algorithm B.1.

Consider a vertex $u \in U$. At any time during the reduction, there are $O(\deg_G(u))$ edges in $H$. So,
extract-min takes $O(\log(\deg_G(u)))$ time. The time for inserting a vertex to $\hat{V}$ and an edge to $\hat{E}$
is $O(1)$, which is dominated by the time for extract-min. Thus, we have to consider only the time for
heap operations. For each vertex $u \in U$, we have to call insertion $\deg_G(u) + |U|$ times and extract-
min $|U|$ times. Thus, the time required to process each vertex of $U$ is $O((\deg_G(u) + |U|) \log |U|)$.
It follows that the total running time of the reduction is $O((|E| + |U|^2) \log |U|) = O(n^2 \log n)$.
Algorithm B.1 Reduction($G = (U \cup V, E)$, $w$)

Create empty sets $\hat{E}$, $\hat{V}$.
for all vertices $u \in U$ do
    Create a binary heap $H$.
    for all edges $e$ incident to $u$ do
        Insert $(e, 1)$ to $H$ with key $w(e)$.
    end for
    for $k \leftarrow 1$ to $|U|$ do
        Extract-min from $H$, resulting in $(e = uv, i)$.
        Insert a vertex $v^i$ to $\hat{V}$ (if it does not exist).
        Insert an edge $uv^i$ to $\hat{E}$ with weight $i \cdot w(e)$.
        Insert $(e' = uv^{i+1}, i + 1)$ to $H$ with key $w(e) \cdot (i + 1)$.
    end for
    Destroy the binary heap $H$.
end for
Return $\hat{G} = (U \cup \hat{V}, \hat{E})$.
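Algorithm B.1 translates almost line for line into code. The sketch below follows the pseudocode under a few stated assumptions: the exploded vertex $v^i$ is encoded as the tuple `(v, i)`, vertex names are mutually comparable so that heap ties can be broken, and the inner loop stops early if a vertex $u$ has no incident edge. The function name is hypothetical.

```python
import heapq


def reduce_to_matching(U, edges, weight):
    """edges: list of (u, v); weight: {(u, v): w}.
    Returns (V_hat, E_hat): V_hat is a set of exploded vertices (v, i) and
    E_hat maps (u, (v, i)) to the weight i * w(u, v)."""
    incident = {u: [] for u in U}
    for u, v in edges:
        incident[u].append(v)

    V_hat, E_hat = set(), {}
    for u in U:
        # Heap entries are (key, v, i) with key = i * w(u, v).
        heap = [(weight[(u, v)], v, 1) for v in incident[u]]
        heapq.heapify(heap)
        for _ in range(len(U)):       # keep only the |U| cheapest edges at u
            if not heap:              # u has no incident edge (assumption)
                break
            key, v, i = heapq.heappop(heap)
            V_hat.add((v, i))
            E_hat[(u, (v, i))] = key
            heapq.heappush(heap, (weight[(u, v)] * (i + 1), v, i + 1))
    return V_hat, E_hat
```

Each vertex of $U$ contributes at most $|U|$ edges to the output, so the result has at most $|U|^2$ edges, as in the analysis above.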



Now, we run algorithms for the bipartite matching problem on the graph $\hat{G}$ with $n^2$ edges. Using
Edmonds-Karp-Tomizawa, the running time becomes $O(nm) = O(n^3)$. Using Gabow-Tarjan's
scaling algorithm, the running time becomes $O(\sqrt{n}\, m \log(nW)) = O(n^{5/2} \log(nW))$, where $W$ is
the maximum edge weight.

C Dinitz's blocking flow algorithm 

In this section, we give an outline of Dinitz's blocking flow algorithm [7]. Given a network $R$
with source $s$ and sink $t$, a flow $g$ is a blocking flow in $R$ if every path from the source to the sink
contains a saturated edge, i.e., an edge with zero residual capacity. A blocking flow is usually called a
greedy flow, since the flow cannot be increased without rerouting some of the previous flow paths.
In a unit capacity network, depth-first search can be used to find a blocking flow in linear time.

Dinitz's algorithm works in a layer graph, a subgraph whose edges are in at least one shortest
path from $s$ to $t$. This condition implies that we only augment along the shortest paths. The
algorithm proceeds by successively finding blocking flows in the layer graphs of the residual graph of
the previous round. The following is an important property (see, e.g., [21 [381 US] for proofs). It
states that the distance between the source and the sink always increases after each blocking flow
step.

In the case of unit capacities, Even-Tarjan [10] and Karzanov [22] showed that the algorithm
finds a maximum flow in time $O(\min\{n^{2/3}, m^{1/2}\}\, m)$. In the case of a unit network, i.e., every vertex
either has indegree 1 or outdegree 1, the algorithm finds a maximum flow in time $O(\sqrt{n}\, m)$.
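For reference, a compact textbook-style version of the blocking flow scheme outlined above is sketched here. It is a generic implementation rather than the Cancel routine of Section 3.3; the class and method names are hypothetical, vertices are assumed to be numbered $0, \ldots, n-1$, and the recursive depth-first search is only suitable for small inputs.

```python
from collections import deque


class Dinic:
    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]   # adj[u] = indices of edges out of u
        self.to, self.cap = [], []          # edge e and its reverse e ^ 1

    def add_edge(self, u, v, c):
        self.adj[u].append(len(self.to)); self.to.append(v); self.cap.append(c)
        self.adj[v].append(len(self.to)); self.to.append(u); self.cap.append(0)

    def _bfs(self, s, t):
        """Build the layer graph: level[v] = residual distance from s."""
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for e in self.adj[u]:
                v = self.to[e]
                if self.cap[e] > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u, t, pushed):
        """Push flow along one path of the layer graph; it[u] skips dead edges."""
        if u == t:
            return pushed
        while self.it[u] < len(self.adj[u]):
            e = self.adj[u][self.it[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.level[v] == self.level[u] + 1:
                got = self._dfs(v, t, min(pushed, self.cap[e]))
                if got > 0:
                    self.cap[e] -= got
                    self.cap[e ^ 1] += got
                    return got
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        """Return (flow value, number of blocking-flow rounds)."""
        flow = rounds = 0
        while self._bfs(s, t):
            rounds += 1
            self.it = [0] * self.n
            while True:
                pushed = self._dfs(s, t, float("inf"))
                if pushed == 0:
                    break
                flow += pushed
        return flow, rounds
```

Returning the round count alongside the flow value makes it easy to observe the round bounds quoted above on small unit networks.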